14 research outputs found

    Maximizing Crosstalk-Induced Slowdown During Path Delay Test

    Capacitive crosstalk between adjacent signal wires in integrated circuits may lead to noise or a speedup or slowdown in signal transitions. These in turn may lead to circuit failure or reduced operating speed. This thesis focuses on generating test patterns to induce crosstalk-induced signal delays, in order to determine whether the circuit can still meet its timing specification. A timing-driven test generator is developed to sensitize multiple aligned aggressors coupled to a delay-sensitive victim path to detect the combination of a delay spot defect and crosstalk-induced slowdown. The framework uses parasitic capacitance information, timing windows and crosstalk-induced delay estimates to screen out unaligned or ineffective aggressors coupled to a victim path, speeding up crosstalk pattern generation. In order to induce maximum crosstalk slowdown along a path, aggressors are prioritized based on their potential delay increase and timing alignment. The test generation engine introduces the concept of alignment-driven path sensitization to generate paths from inputs to coupled aggressor nets that meet timing alignment and direction requirements. By using path delay information obtained from circuit preprocessing, preferred paths can be chosen during aggressor path propagation processes. As the test generator sensitizes aggressors in the presence of victim path necessary assignments, the search space is effectively reduced for aggressor path generation. This helps in reducing the test generation time for aligned aggressors. In addition, two new crosstalk-driven dynamic test compaction algorithms are developed to control the increase in test pattern count. The proposed test generation algorithm is applied to ISCAS85 and ISCAS89 benchmark circuits. SPICE simulation results demonstrate the ability of the alignment-driven test generator to increase crosstalk-induced delays along victim paths

    Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers

    Quantization scale and bit-width are the most important parameters when considering how to quantize a neural network. Prior work focuses on optimizing quantization scales in a global manner through gradient methods (gradient descent & Hessian analysis). Yet, when applying perturbations to quantization scales, we observe a very jagged, highly non-smooth test loss landscape. In fact, small perturbations in quantization scale can greatly affect accuracy, yielding a 0.5-0.8% accuracy boost in 4-bit quantized vision transformers (ViTs). In this regime, gradient methods break down, since they cannot reliably reach local minima. In our work, dubbed Evol-Q, we use evolutionary search to effectively traverse the non-smooth landscape. Additionally, we propose using an infoNCE loss, which not only helps combat overfitting on the small calibration dataset (1,000 images) but also makes traversing such a highly non-smooth surface easier. Evol-Q improves the top-1 accuracy of a fully quantized ViT-Base by 10.30%, 0.78%, and 0.15% for 3-bit, 4-bit, and 8-bit weight quantization levels. Extensive experiments on a variety of CNN and ViT architectures further demonstrate its robustness in extreme quantization scenarios. Our code is available at https://github.com/enyac-group/evol-q Comment: arXiv admin note: text overlap with arXiv:2211.0964
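As a rough illustration of the idea in this abstract, the toy sketch below perturbs a single quantization scale with a simple evolutionary loop and keeps a child only when it lowers a loss. It is not the Evol-Q implementation: the function names (`quantize`, `mse`, `evolve_scale`), the symmetric uniform quantizer, and the use of mean squared error on the raw values (in place of the infoNCE loss on calibration images described above) are all assumptions made for illustration.

```python
import random


def quantize(values, scale, bits):
    """Symmetric uniform quantization: round to the grid, clamp, then rescale."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    return [scale * max(qmin, min(qmax, round(v / scale))) for v in values]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def evolve_scale(values, scale, bits, generations=50, children=8, sigma=0.02, seed=0):
    """Toy evolutionary search over one scale (hypothetical stand-in loss: MSE).

    Each generation samples `children` multiplicative perturbations of the
    current parent scale and keeps the best child only if it beats the parent.
    """
    rng = random.Random(seed)
    parent = scale
    parent_loss = mse(values, quantize(values, parent, bits))
    for _ in range(generations):
        candidates = [parent * (1.0 + rng.gauss(0.0, sigma)) for _ in range(children)]
        scored = [(mse(values, quantize(values, c, bits)), c) for c in candidates if c > 0]
        if not scored:
            continue
        child_loss, child = min(scored)
        if child_loss < parent_loss:
            parent, parent_loss = child, child_loss
    return parent, parent_loss


if __name__ == "__main__":
    weights = [0.05 * i for i in range(-10, 11)]
    best_scale, best_loss = evolve_scale(weights, scale=0.2, bits=4)
    print(best_scale, best_loss)
```

Because a child replaces the parent only when it strictly improves the loss, the search needs no gradient and the returned loss can never exceed the starting loss, which is what makes this kind of search usable on a jagged loss surface.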

    Run-Time Efficient RNN Compression for Inference on Edge Devices

    Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints. As a result, there is a need for compression techniques that can achieve significant compression without negatively impacting inference run-time and task accuracy. This paper explores a new compressed RNN cell implementation called Hybrid Matrix Decomposition (HMD) that achieves this dual objective. This scheme divides the weight matrix into two parts - an unconstrained upper half and a lower half composed of rank-1 blocks. This results in output features where the upper sub-vector has "richer" features while the lower sub-vector has "constrained" features. HMD can compress RNNs by a factor of 2-4x while having a faster run-time than pruning (Zhu & Gupta, 2017) and retaining more model accuracy than matrix factorization (Grachev et al., 2017). We evaluate this technique on 5 benchmarks spanning 3 different applications, illustrating its generality in the domain of edge computing. Comment: Published at 4th edition of Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications at International Symposium of Computer Architecture 2019, Phoenix, Arizona (https://www.emc2-workshop.com/isca-19) colocated with ISCA 201
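The split described in this abstract can be sketched in a few lines. The code below is a minimal illustration, not the paper's implementation: it assumes the lower part is a vertical stack of row blocks, each stored as one outer product of a column vector `u` and a row vector `v`, and the helper names (`hmd_matvec`, `param_count`, `dense_matvec`) are hypothetical. The abstract does not say how the blocks are arranged, so that layout is an assumption.

```python
def dense_matvec(matrix, x):
    """Reference dense matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def hmd_matvec(upper, blocks, x):
    """Multiply a hybrid matrix by x without materializing the lower blocks.

    upper  -- dense rows (the unconstrained part)
    blocks -- list of (u, v) pairs; each block equals outer(u, v), so
              outer(u, v) @ x == u * dot(v, x)
    """
    y = dense_matvec(upper, x)
    for u, v in blocks:
        s = sum(vj * xj for vj, xj in zip(v, x))  # one dot product per block
        y.extend(ui * s for ui in u)
    return y


def param_count(upper, blocks):
    """Stored parameters: dense upper entries plus len(u) + len(v) per block."""
    return sum(len(row) for row in upper) + sum(len(u) + len(v) for u, v in blocks)


if __name__ == "__main__":
    upper = [[1, 2, 3, 4], [5, 6, 7, 8]]
    blocks = [([1, 2], [1, 0, 1, 0])]  # one 2x4 rank-1 block
    x = [1, 1, 1, 1]
    print(hmd_matvec(upper, blocks, x))
    print(param_count(upper, blocks))
```

Each rank-1 block costs one dot product plus a scaling of `u`, rather than a full dense block multiply, which suggests why such a structure can reduce both storage and run-time under these assumptions.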

    PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices

    The ability to accurately predict deep neural network (DNN) inference performance metrics, such as latency, power, and memory footprint, for an arbitrary DNN on a target hardware platform is essential to the design of DNN based models. This ability is critical for the (manual or automatic) design, optimization, and deployment of practical DNNs for a specific hardware deployment platform. Unfortunately, these metrics are slow to evaluate using simulators (where available) and typically require measurement on the target hardware. This work describes PerfSAGE, a novel graph neural network (GNN) that predicts inference latency, energy, and memory footprint on an arbitrary DNN TFlite graph (TFL, 2017). In contrast, previously published performance predictors can only predict latency and are restricted to pre-defined construction rules or search spaces. This paper also describes the EdgeDLPerf dataset of 134,912 DNNs randomly sampled from four task search spaces and annotated with inference performance metrics from three edge hardware platforms. Using this dataset, we train PerfSAGE and provide experimental results that demonstrate state-of-the-art prediction accuracy with a Mean Absolute Percentage Error of <5% across all targets and model search spaces. These results: (1) Outperform previous state-of-art GNN-based predictors (Dudziak et al., 2020), (2) Accurately predict performance on accelerators (a shortfall of non-GNN-based predictors (Zhang et al., 2021)), and (3) Demonstrate predictions on arbitrary input graphs without modifications to the feature extractor
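The accuracy figure quoted in this abstract is a Mean Absolute Percentage Error. A minimal sketch of the standard definition follows; the function name `mape` and the sample numbers are illustrative assumptions, not values from the paper.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Every entry of `actual` must be non-zero, since each error is
    expressed relative to the measured value.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    measured_latency_ms = [100.0, 200.0]   # hypothetical measurements
    predicted_latency_ms = [110.0, 190.0]  # hypothetical predictions
    print(mape(measured_latency_ms, predicted_latency_ms))
```

Because each error is divided by the measured value, the metric is scale-free, which is convenient when comparing quantities with different units such as latency, energy, and memory footprint.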

    Effect of participatory women’s groups facilitated by Accredited Social Health Activists on birth outcomes in rural eastern India: a cluster-randomised controlled trial

    Background A quarter of the world’s neonatal deaths and 15% of maternal deaths happen in India. Few community-based strategies to improve maternal and newborn health have been tested through the country’s government-approved Accredited Social Health Activists (ASHAs). We aimed to test the effect of participatory women’s groups facilitated by ASHAs on birth outcomes, including neonatal mortality. Methods In this cluster-randomised controlled trial of a community intervention to improve maternal and newborn health, we randomly assigned (1:1) geographical clusters in rural Jharkhand and Odisha, eastern India to intervention (participatory women’s groups) or control (no women’s groups). Study participants were women of reproductive age (15–49 years) who gave birth between Sept 1, 2009, and Dec 31, 2012. In the intervention group, ASHAs supported women’s groups through a participatory learning and action meeting cycle. Groups discussed and prioritised maternal and newborn health problems, identified strategies to address them, implemented the strategies, and assessed their progress. We identified births, stillbirths, and neonatal deaths, and interviewed mothers 6 weeks after delivery. The primary outcome was neonatal mortality over a 2 year follow up. Analyses were by intention to treat. This trial is registered with ISRCTN, number ISRCTN31567106. Findings Between September, 2009, and December, 2012, we randomly assigned 30 clusters (estimated population 156 519) to intervention (15 clusters, estimated population n=82 702) or control (15 clusters, n=73 817). During the follow-up period (Jan 1, 2011, to Dec 31, 2012), we identified 3700 births in the intervention group and 3519 in the control group. One intervention cluster was lost to follow up. The neonatal mortality rate during this period was 30 per 1000 livebirths in the intervention group and 44 per 1000 livebirths in the control group (odds ratio [OR] 0.69, 95% CI 0·53–0·89).
Interpretation ASHAs can successfully reduce neonatal mortality through participatory meetings with women’s groups. This is a scalable community-based approach to improving neonatal survival in rural, underserved areas of India